Exploratory HJB Equations and Their Convergence
Abstract
We study the exploratory Hamilton–Jacobi–Bellman (HJB) equation arising from the entropy-regularized exploratory control problem, which was formulated by Wang, Zariphopoulou, and Zhou (J. Mach. Learn. Res., 21 (2020), 198) in the context of reinforcement learning in continuous time and space. We establish the well-posedness and regularity of the viscosity solution to the equation, as well as the convergence of the exploratory control problem to the classical stochastic control problem when the level of exploration decays to zero. We then apply the general results obtained to the exploratory temperature control problem, which was introduced by Gao, Xu, and Zhou (SIAM J. Control Optim., 60 (2022), pp. 1250–1268) to design an endogenous temperature schedule for simulated annealing in the context of nonconvex optimization. We derive an explicit rate of convergence for this problem as exploration diminishes to zero, and find that the stationary distribution of the optimally controlled process exists, which is however neither a Dirac mass on the global optimum nor a Gibbs measure.
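The convergence claim in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's continuous-time setting: it assumes a finite action set with hypothetical rewards, and uses the standard fact that maximizing expected reward plus `lam` times Shannon entropy over distributions gives the value `lam * log(sum(exp(r / lam)))`, attained by a Gibbs distribution. The names `soft_max_value`, `gibbs_policy`, `rewards`, and `lam` are illustrative only.

```python
import math


def soft_max_value(rewards, lam):
    """Entropy-regularized value lam * log(sum(exp(r / lam))).

    Computed with the maximum subtracted first for numerical stability.
    """
    m = max(rewards)
    return m + lam * math.log(sum(math.exp((r - m) / lam) for r in rewards))


def gibbs_policy(rewards, lam):
    """Distribution over actions that attains the regularized value."""
    m = max(rewards)
    weights = [math.exp((r - m) / lam) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rewards for three actions (assumption, not from the paper).
rewards = [1.0, 0.5, -0.2]
lams = (1.0, 0.1, 0.01)

# Gap between the regularized value and the classical (unregularized) maximum.
gaps = [soft_max_value(rewards, lam) - max(rewards) for lam in lams]
policies = [gibbs_policy(rewards, lam) for lam in lams]

for lam, gap, policy in zip(lams, gaps, policies):
    print(f"lam={lam}: gap={gap:.6f}, policy={[round(p, 4) for p in policy]}")
```

In this toy case the gap lies between 0 and `lam * log(n)` for `n` actions, so it vanishes as `lam` decays to zero, and the Gibbs policy concentrates on the best action. The paper's results concern the analogous limit for the exploratory HJB equation, which requires the viscosity-solution analysis described in the abstract.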
Similar resources
Nonlinear HJB Equations
This paper is concerned with the standard finite element approximation of Hamilton-Jacobi-Bellman (HJB) equations with nonlinear source terms. Under a realistic condition on the nonlinearity, we characterize the discrete solution as a fixed point of a contraction. As a result of this, we also derive a sharp L∞ error estimate of the approximation. Mathematics Subject Classification: Primary 35F21;...
HJB Equations for Certain Singularly Controlled Diffusions
over the admissible controls U . Both g and κ · u (u ∈ U) may take positive and negative values. This paper studies the corresponding dynamic programming equation (DPE), a second-order degenerate elliptic partial differential equation of HJB-type with a state constraint boundary condition. Under the controllability condition GU = R and the finiteness of H(q) = supu∈U1{−Gu · q− κ · u}, q ∈ R , w...
A New Scheme for Discrete HJB Equations
In this paper we propose a relaxation scheme for solving discrete HJB equations based on scheme II [1] of Lions and Mercier. The convergence of the new scheme is established. A numerical example shows that the scheme is efficient.
Asymptotic Analysis of Forward Performance Processes in Incomplete Markets and Their Ill-Posed HJB Equations
We consider the problem of optimal portfolio selection under forward investment performance criteria in an incomplete market. The dynamics of the prices of the traded assets depend on a pair of stochastic factors, namely, a slow factor (e.g. a macroeconomic indicator) and a fast factor (e.g. stochastic volatility). We analyze the associated forward performance SPDE and provide explicit formulae...
Pathwise Stochastic Control Problems and Stochastic HJB Equations
In this paper we study a class of pathwise stochastic control problems in which the optimality is allowed to depend on the paths of exogenous noise (or information). Such a phenomenon can be illustrated by considering a particular investor who wants to take advantage of certain extra information but in a completely legal manner. We show that such a control problem may not even have a “minimizin...
Journal
Journal title: SIAM Journal on Control and Optimization
Year: 2022
ISSN: ['0363-0129', '1095-7138']
DOI: https://doi.org/10.1137/21m1448185